Transliteration Considering Context Information based on the Maximum Entropy Method
Abstract
This paper proposes a method for automatically transliterating English words into Japanese. The method transliterates an English word that is not registered in any bilingual or pronunciation dictionary by converting each substring of letters in the word into Japanese katakana characters. In such transliteration, identical letters occurring in different English words must often be converted into different katakana. To produce an adequate transliteration, the proposed method considers how the letters of an English word are chunked into conversion units and uses English and Japanese context information simultaneously to calculate the plausibility of each conversion. We have confirmed experimentally that the proposed method improves conversion accuracy by 63% compared to a simple method that ignores the plausibility of chunking and contextual information.
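The idea of scoring chunkings and conversions jointly can be illustrated with a minimal sketch. It is not the paper's model: the conversion table, the feature templates, and all weights below are hypothetical, chosen only to show how a log-linear score (the form a maximum entropy model takes) can combine a conversion unit with English context (the following letter) and Japanese context (the preceding katakana). The score is left unnormalized because only the best candidate is selected.

```python
# Toy conversion units and katakana candidates (hypothetical, for illustration only).
CANDIDATES = {
    "ca": ["キャ", "カ"],
    "c": ["ク"],
    "a": ["ア"],
    "t": ["ト", "ット"],
}

# Hypothetical feature weights; a real system would estimate them from
# aligned English-katakana training pairs.
WEIGHTS = {
    ("unit", "ca", "キャ"): 1.0,
    ("unit", "ca", "カ"): 0.5,
    ("unit", "c", "ク"): 0.2,
    ("unit", "a", "ア"): 0.2,
    ("unit", "t", "ト"): 0.5,
    ("unit", "t", "ット"): 0.5,
    ("next_en", "t", "ット", "$"): 1.0,  # English context: word-final "t"
    ("prev_ja", "キャ", "ット"): 0.5,     # Japanese context: preceding katakana
}


def unit_score(word, start, end, kana, prev_kana):
    """Sum the weights of the features that fire for one conversion unit."""
    chunk = word[start:end]
    next_en = word[end] if end < len(word) else "$"  # "$" marks the word end
    feats = [
        ("unit", chunk, kana),
        ("next_en", chunk, kana, next_en),
        ("prev_ja", prev_kana, kana),
    ]
    return sum(WEIGHTS.get(f, 0.0) for f in feats)


def candidates(word, start=0, prev_kana="^"):
    """Yield (score, katakana units) for every chunking and conversion of word[start:]."""
    if start == len(word):
        yield 0.0, []
        return
    for end in range(start + 1, len(word) + 1):
        for kana in CANDIDATES.get(word[start:end], []):
            s = unit_score(word, start, end, kana, prev_kana)
            for rest_score, rest in candidates(word, end, kana):
                yield s + rest_score, [kana] + rest


def transliterate(word):
    """Return the highest-scoring katakana string, or None if no chunking is covered."""
    best = max(candidates(word), key=lambda c: c[0], default=None)
    return "".join(best[1]) if best else None
```

With these toy weights, transliterate("cat") selects the chunking "ca" + "t" and returns キャット, because the word-final and preceding-katakana features favor ット over ト. A baseline that ignored chunking plausibility and context would have no basis for preferring that candidate. The exhaustive enumeration is only practical for short words; a dynamic-programming search would be the usual replacement.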
Similar Resources
Hypothesis Selection in Machine Transliteration: A Web Mining Approach
We propose a new method of selecting hypotheses for machine transliteration. We generate a set of Chinese, Japanese, and Korean transliteration hypotheses for a given English word. We then use the set of transliteration hypotheses as a guide to finding relevant Web pages and mining contextual information for the transliteration hypotheses from the Web page. Finally, we use the mined information...
Named Entity Translation with Web Mining and Transliteration
This paper presents a novel approach to improve the named entity translation by combining a transliteration approach with web mining, using web information as a source to complement transliteration, and using transliteration information to guide and enhance web mining. A Maximum Entropy model is employed to rank translation candidates by combining pronunciation similarity and bilingual contextu...
Machine Transliteration Using Multiple Transliteration Engines and Hypothesis Re-Ranking
This paper describes a novel method of improving machine transliteration by using multiple transliteration hypotheses and re-ranking them. We constructed seven machine-transliteration engines to produce a set of transliteration hypotheses. We then re-ranked the hypotheses to select the correct transliteration hypothesis. We propose a re-ranking method that makes use of confidence-score, languag...
Evaluation of monitoring network density using discrete entropy theory
The regional evaluation of monitoring stations for water resources can be of great importance due to its role in finding appropriate locations for stations, the maximum gathering of useful information and preventing the accumulation of unnecessary information and ultimately reducing the cost of data collection. Based on the theory of discrete entropy, this study analyzes the density of rain gag...
The Amirkabir Machine Transliteration System for NEWS 2011: Farsi-to-English Task
In this paper we describe the statistical machine transliteration system of Amirkabir University of Technology, developed for the NEWS 2011 shared task. This year we participated in the English-to-Persian language pair. We use three systems for transliteration: the first system is a maximum entropy model with a newly proposed alignment algorithm. The second system is the Sequitur g2p tool, an open source gra...